Visualization concept created by Leland Wilkinson (The Grammar of Graphics, 1999)
attempt to taxonomize the basic elements of statistical graphics
Adapted for R by Hadley Wickham (2009)
consistent and compact syntax to describe statistical graphics
highly modular as it breaks up graphs into semantic components
ggplot2 is not meant as a guide to which graph to use and how to best convey your data (more on that later), but it does have some strong opinions.
A statistical graphic is a…
mapping of data
which may be statistically transformed (summarized, log-transformed, etc.)
to aesthetic attributes (color, size, xy-position, etc.)
using geometric objects (points, lines, bars, etc.)
and mapped onto a specific facet and coordinate system
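One way to see these pieces together is a small ggplot2 sketch in which each line corresponds to one component of the list above. It uses the built-in mtcars data so that it runs on its own; the variables, the linear smoother, and the faceting variable are illustrative choices, not taken from the slides.

```r
library(ggplot2)

p <- ggplot(
  data = mtcars,                                 # data
  mapping = aes(x = wt, y = mpg)                 # aesthetic attributes: xy-position
) +
  geom_point(aes(color = factor(cyl))) +         # geometric object: points, colored by cylinder count
  geom_smooth(method = "lm", formula = y ~ x) +  # statistical transformation: fitted linear trend
  facet_wrap(~ am) +                             # facets: one panel per transmission type
  coord_cartesian()                              # coordinate system (the default, written out)

p
```

Removing or swapping any single line changes one component of the graphic while leaving the others untouched, which is the modularity described earlier.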
Measurements for penguin species, island in Palmer Archipelago, size (flipper length, body mass, bill dimensions), and sex.
# A tibble: 344 × 8
species island bill_le…¹ bill_…² flipp…³ body_…⁴ sex
<fct> <fct> <dbl> <dbl> <int> <int> <fct>
1 Adelie Torgersen 39.1 18.7 181 3750 male
2 Adelie Torgersen 39.5 17.4 186 3800 fema…
3 Adelie Torgersen 40.3 18 195 3250 fema…
4 Adelie Torgersen NA NA NA NA <NA>
5 Adelie Torgersen 36.7 19.3 193 3450 fema…
6 Adelie Torgersen 39.3 20.6 190 3650 male
7 Adelie Torgersen 38.9 17.8 181 3625 fema…
8 Adelie Torgersen 39.2 19.6 195 4675 male
9 Adelie Torgersen 34.1 18.1 193 3475 <NA>
10 Adelie Torgersen 42 20.2 190 4250 <NA>
# … with 334 more rows, 1 more variable: year <int>, and
# abbreviated variable names ¹bill_length_mm,
# ²bill_depth_mm, ³flipper_length_mm, ⁴body_mass_g
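A preview like the one above can be reproduced roughly as follows. This is a minimal sketch that assumes the palmerpenguins package is installed; the exact column abbreviations in the printout depend on the console width and the tibble version.

```r
library(palmerpenguins)  # provides the penguins data frame

penguins         # prints the first rows, as in the preview above
dim(penguins)    # 344 rows and 8 columns
names(penguins)  # full, unabbreviated column names
```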
Start with the penguins data frame.
Start with the penguins data frame, map bill depth to the x-axis.
ggplot(
  data = penguins,
  mapping = aes(x = bill_depth_mm)
)
Start with the penguins data frame, map bill depth to the x-axis and map bill length to the y-axis.
Start with the penguins data frame, map bill depth to the x-axis and map bill length to the y-axis. Represent each observation with a point.
Start with the penguins data frame, map bill depth to the x-axis and map bill length to the y-axis. Represent each observation with a point and map species to the color of each point.
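The steps so far can be sketched as code built up one piece at a time. This assumes the ggplot2 and palmerpenguins packages are installed; the object names p_base and p_points are only for illustration and do not appear in the slides.

```r
library(ggplot2)
library(palmerpenguins)

# Data plus x/y mappings: an empty canvas with axes, but nothing drawn yet.
p_base <- ggplot(
  data = penguins,
  mapping = aes(x = bill_depth_mm, y = bill_length_mm)
)

# Add a point geom and map species to the color of each point.
p_points <- p_base +
  geom_point(mapping = aes(color = species))

# Printing draws the plot; ggplot2 warns about rows with missing bill measurements.
p_points
```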
Start with the penguins data frame, map bill depth to the x-axis and map bill length to the y-axis. Represent each observation with a point and map species to the color of each point. Title the plot “Bill depth and length”.
Start with the penguins data frame, map bill depth to the x-axis and map bill length to the y-axis. Represent each observation with a point and map species to the color of each point. Title the plot “Bill depth and length”, add the subtitle “Dimensions for Adelie, Chinstrap, and Gentoo Penguins”.
Start with the penguins data frame, map bill depth to the x-axis and map bill length to the y-axis. Represent each observation with a point and map species to the color of each point. Title the plot “Bill depth and length”, add the subtitle “Dimensions for Adelie, Chinstrap, and Gentoo Penguins”, label the x and y axes as “Bill depth (mm)” and “Bill length (mm)”, respectively.
Start with the penguins data frame, map bill depth to the x-axis and map bill length to the y-axis. Represent each observation with a point and map species to the color of each point. Title the plot “Bill depth and length”, add the subtitle “Dimensions for Adelie, Chinstrap, and Gentoo Penguins”, label the x and y axes as “Bill depth (mm)” and “Bill length (mm)”, respectively, label the legend “Species”.
ggplot(
  data = penguins,
  mapping = aes(
    x = bill_depth_mm,
    y = bill_length_mm
  )
) +
  geom_point(
    mapping = aes(color = species)
  ) +
  labs(
    title = "Bill depth and length",
    subtitle = paste("Dimensions for Adelie,",
                     "Chinstrap, and Gentoo",
                     "Penguins"),
    x = "Bill depth (mm)",
    y = "Bill length (mm)",
    color = "Species"
  )
Start with the penguins data frame, map bill depth to the x-axis and map bill length to the y-axis. Represent each observation with a point and map species to the color of each point. Title the plot “Bill depth and length”, add the subtitle “Dimensions for Adelie, Chinstrap, and Gentoo Penguins”, label the x and y axes as “Bill depth (mm)” and “Bill length (mm)”, respectively, label the legend “Species”, and add a caption for the data source.
ggplot(
  data = penguins,
  mapping = aes(
    x = bill_depth_mm,
    y = bill_length_mm
  )
) +
  geom_point(
    mapping = aes(color = species)
  ) +
  labs(
    title = "Bill depth and length",
    subtitle = paste("Dimensions for Adelie,",
                     "Chinstrap, and Gentoo",
                     "Penguins"),
    x = "Bill depth (mm)",
    y = "Bill length (mm)",
    color = "Species",
    caption = "Source: palmerpenguins package"
  )
Start with the penguins data frame, map bill depth to the x-axis and map bill length to the y-axis. Represent each observation with a point and map species to the color of each point. Title the plot “Bill depth and length”, add the subtitle “Dimensions for Adelie, Chinstrap, and Gentoo Penguins”, label the x and y axes as “Bill depth (mm)” and “Bill length (mm)”, respectively, label the legend “Species”, and add a caption for the data source. Finally, use the viridis color palette for all points.
ggplot(
  data = penguins,
  mapping = aes(
    x = bill_depth_mm,
    y = bill_length_mm
  )
) +
  geom_point(
    mapping = aes(color = species)
  ) +
  labs(
    title = "Bill depth and length",
    subtitle = paste("Dimensions for Adelie,",
                     "Chinstrap, and Gentoo",
                     "Penguins"),
    x = "Bill depth (mm)",
    y = "Bill length (mm)",
    color = "Species",
    caption = "Source: palmerpenguins package"
  ) +
  scale_color_viridis_d()
Often we omit the names of the first two arguments when building plots with ggplot().
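For instance, data and mapping are the first two arguments of ggplot(), and mapping is the first argument of geom_point(), so they can be passed by position. A minimal sketch, assuming ggplot2 and palmerpenguins are installed; the object name p_short is only for illustration.

```r
library(ggplot2)
library(palmerpenguins)

# Same plot as the named version: data first, then the aesthetic mapping.
p_short <- ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm)) +
  geom_point(aes(color = species)) +
  scale_color_viridis_d()

p_short
```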
Commonly used characteristics of plotting geometries that can be mapped to a specific variable in the data. Examples include:
position (x, y)
color
shape
size
alpha (transparency)
Different geometries support different aesthetics; see the ggplot2 geoms help files for listings.
Aesthetics given in ggplot() apply to all geoms.
Aesthetics for a specific geom_*() can be overridden via mapping or as an argument.
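The inheritance and override rules can be sketched as follows, assuming ggplot2 and palmerpenguins are installed. The object names and the choice of island and "black" as overrides are illustrative, not from the slides.

```r
library(ggplot2)
library(palmerpenguins)

# color = species is given in ggplot(), so it applies to every geom added later.
p_base <- ggplot(
  penguins,
  aes(x = bill_depth_mm, y = bill_length_mm, color = species)
)

# Both layers inherit the color mapping.
p_inherit <- p_base +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# Override via mapping: this geom colors points by island instead of species.
p_override_map <- p_base +
  geom_point(aes(color = island))

# Override via argument: the trend line uses a fixed color instead of the species mapping.
p_override_arg <- p_base +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, color = "black")
```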
Mapped to a different variable than color
Mapped to the same variable as color
Using a fixed value (note this value is outside of the aes() call)
Mapped to a variable
Aesthetic: define in aes() and pass as the mapping argument to ggplot() or geom_*().
Setting: pass directly into geom_*() as an argument.
From the previous slide: color, shape, and alpha were all aesthetics, while size was a setting.
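A short sketch of the distinction, assuming ggplot2 and palmerpenguins are installed. The specific variables and the size value are illustrative assumptions and are not meant to reproduce the plots from the previous slide.

```r
library(ggplot2)
library(palmerpenguins)

p_mix <- ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm)) +
  geom_point(
    # Aesthetics: inside aes(), so they vary with the data and get a legend.
    mapping = aes(color = species, shape = island),
    # Setting: outside aes(), so every point gets the same size and no legend is drawn for it.
    size = 3
  )

p_mix
```

Writing size = 3 inside aes() would instead treat the constant 3 as data and add a legend entry for it, which is rarely what is intended.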
Smaller plots that display different subsets of the data
Useful for exploring conditional relationships and large data
Sometimes referred to as “small multiples”
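As a minimal sketch of faceting, assuming ggplot2 and palmerpenguins are installed: facet_wrap() splits the plot by one variable, and facet_grid() arranges panels by a row variable and a column variable. The faceting variables and object names here are illustrative choices.

```r
library(ggplot2)
library(palmerpenguins)

p_scatter <- ggplot(penguins, aes(x = bill_depth_mm, y = bill_length_mm)) +
  geom_point()

# One panel per species, wrapped into a grid.
p_wrap <- p_scatter + facet_wrap(~ species)

# Rows by island, columns by species; combinations with no data give empty panels.
p_grid <- p_scatter + facet_grid(island ~ species)

p_wrap
p_grid
```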
Recreate, as faithfully as possible, the following plot using ggplot2 and the penguins data.
Recreate, as faithfully as possible, the following plot from the palmerpenguins package README using ggplot2.
Sta 523 - Fall 2022